knitr::opts_chunk$set(echo = TRUE)
\newpage
This walkthrough demonstrates how to perform an RERconverge analysis using only the data at the tips of the tree, skipping the phylogenetic inference step of a typical RERconverge analysis. This walkthrough builds on existing RERconverge objects. First time users should first read the "RERconverge Analysis Walkthrough" vignette for information on installation, setup, and getting started.
Typically, we recommend including ancestral states in an RERconverge analysis because incorporating evolutionary information can strengthen the statistical power of the analysis. However, there may be phenotypes in which ancestral states are not as informative and could add noise to the results. In that case, we present a method for calculating association statistics between relative evolutionary rates and phenotype values using only the extant species in the tree.
The required inputs are as follows:
Phylogenetic trees of the same format described in the "RERconverge Analysis Walkthrough" vignette.
Species-labeled phenotype values
The species labels MUST match the tree tip labels that will be used
in getAllResiduals
to calculate the relative evolutionary rates
(RERs)
Refer to the "RERconverge Analysis Walkthrough"
vignette
to learn how to read in gene trees using readTrees
and calculate
evolutionary rates using getAllResiduals
.
Running the code below will read in some example trees that come with the RERconverge package that we will use for this walkthrough.
# check RERconverge is properly installed library(RERconverge) # find where the package is located on your machine rerpath = find.package('RERconverge') # read in the trees with the given file name toytreefile = "subsetMammalGeneTrees.txt" toyTrees=readTrees(paste(rerpath,"/extdata/",toytreefile,sep=""), max.read = 200) # calculate the relative evolutionary rates with getAllResiduals RERmat = getAllResiduals(toyTrees)
First, we define foreground species for the hibernation binary phenotype and generate a named phenotype vector.
library(RERconverge) # define the foreground species hibextantforeground = c("Vole", "Brown_bat", "Myotis_bat", "Squirrel", "Jerboa") # make a phenotype vector for the species in the tree # the phenotype values must be numeric (0 and 1 instead of TRUE and FALSE) hibphenvals = rep(0, length(toyTrees$masterTree$tip.label)) names(hibphenvals) = toyTrees$masterTree$tip.label # set the foreground species to true hibphenvals[hibextantforeground] = 1
Finally, we calculate statistics using getAllCorExtantOnly
which takes
the following as input:
RERmat
: The RER matrix returned by getAllResiduals
.
phenvals
: the named phenotype vector with names matching those
used to calculated RERs in getAllResiduals
.
method
: set to "k"
for binary traits to calculate Kendall rank
coefficients, "p"
for continuous traits to use a Pearson
correlation, and "aov"
or "kw"
for categorical traits to use an
ANOVA or Kruskal Wallis test respectively.
min.sp
: The minimum number extant species in the gene tree for
that gene to be included in the analysis.
min.pos
: The minimum number of extant foreground species in the
gene tree for that gene to be included in the analysis.
winsorizeRER/winsorizetrait
: pulls the most extreme N values
(default N=3) in both the positive and negative tails to the value
of the N+1 most extreme value. This process mitigates the effect of
extreme outliers before calculating correlations.
# set method to k to use a Kendall rank test since this is a binary phenotype cors = getAllCorExtantOnly(RERmat, hibphenvals, method = "k") # view the top results head(cors[order(cors$P),])
For further analysis of the gene results, such as calculating functional enrichments, refer to the "RERconverge Analysis Walkthrough" vignette.
We will follow much the same steps for continuous traits as we did for binary traits. First, ensure that you followed the instructions above for reading in the gene trees and calculating relative evolutionary rates.
Next, we will load in some example data provided by RERconverge for the mammal body weight phenotype and calculate association statistics.
# load in the example data data("logAdultWeightcm") # set method to p to use a Pearson correlation since this is a continuous phenotype cors = getAllCorExtantOnly(RERmat, phenvals = logAdultWeightcm, method = "p") # view the top results head(cors[order(cors$P),])
For further analysis of the gene results, such as calculating functional enrichments, refer to the "RERconverge Analysis Walkthrough" vignette.
Once again, we will follow very similar steps for categorical traits as for both binary and continuous traits. First, ensure that you followed the instructions above for reading in the gene trees and calculating relative evolutionary rates.
Next, we will load in some example data provided by RERconverge for the basal rate phenotype and calculate association statistics.
# load in the example data data("basalRate") # set method to kw to use a Kruskal Wallis test since this is a categorical phenotype cors = getAllCorExtantOnly(RERmat, phenvals = basalRate, method = "kw")
Finally, we can view the results for all categories or for the pairwise comparisons between categories.
# the first element of cors is a table of association statistics for the Kruskal Wallis # or ANOVA test across categories all_categories_results = cors[[1]] # view top results head(all_categories_results[order(all_categories_results$P),]) # the second element of cors is a list of tables of pairwise comparisons between categories pairwise_tests = cors[[2]] names(pairwise_tests) # view top results of pairwise test between low and medium basal rate species head(pairwise_tests[[3]][order(pairwise_tests[[3]]$P),])
This concludes the walkthrough on how to calculate association statistics between phenotype and relative evolutionary rates with only the extant species in the tree.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.